Abstract. In this manuscript we develop a version of Szemerédi's regularity lemma that is suitable for 
analyzing multicolorings of complete graphs and directed graphs. In this, we follow the proof of Alon, 
Fischer, Krivelevich and M. Szegedy [Combinatorica 20(4) (2000) 451-476] who prove a similar result for 
graphs. 

The purpose is to extend classical results on dense hereditary properties, such as the speed of the property 
or edit distance, to the above-mentioned combinatorial objects. 

1. Introduction 

We develop a version of Szemerédi's regularity lemma that is suitable for analyzing multicolorings of 
complete graphs and directed graphs. In proving our theorems we use as our guide the proof given by Alon, 
Fischer, Krivelevich and M. Szegedy, which proves a similar theorem in the case of graphs. Their idea 
is as follows: given a graph $G$, they find an induced subgraph $G'$ and two equipartitions, A of $V(G)$ and A' of 
$V(G')$. The partitions A and A' have the same number of parts. Each part of A' is large and contained in 
some part of A, each pairwise density of the parts in A' is close to the density of the corresponding pair in 
A, but all pairs in A' are regular. Our goal is to find an induced copy of $H$ in $G$. If enough of the pairs of 
parts in A have a sufficiently large density, we can apply the regularity lemma and Ramsey's theorem inside 
each of the parts of A'. A slicing lemma ensures that the resulting subclusters (we call them miniclusters) 
are ready to witness the embedding of a graph $H$. 

In fact, this approach works for any combinatorial object that has a sufficiently similar type of regularity 
lemma. 

Outline of the paper: In the following subsections, we give basic definitions and the results we need for 
the graph version (Section 1.1), the multicolor version (Section 1.2) and the digraph version (Section 1.3). 
Section 1.4 gives the main result. In Section 2 we prove our main results for multicolored graphs and for 
directed graphs simultaneously - the main machinery depends very little on the combinatorial object to be 
studied. In Section 3 we apply our result to a specific problem related to edit distance. 

Definition 1.1. A partition $A = \{V_i : 1 \le i \le k\}$ is an equipartition of a finite set if $|V_i|$ and $|V_{i'}|$ differ 
by at most 1 for all $1 \le i < i' \le k$. A refinement of $A$ is a partition $B = \{V_{i,j_i} : 1 \le i \le k, 1 \le j_i \le \ell_i\}$ 
such that $V_i = \bigcup_{j_i=1}^{\ell_i} V_{i,j_i}$ for $i = 1, \ldots, k$. The number of parts of a partition is its order. 

Just to ensure some technicalities, we prove that every equipartition can be refined into an equipartition. 

Proposition 1.2. Let $A = \{V_i : 1 \le i \le k\}$ be an equipartition of a finite set and let $\ell$ be a positive integer, 
$\ell \le |V_i|$, $i = 1, \ldots, k$. There exists a refinement of $A$ into $k\ell$ parts that is an equipartition. 

Proof. If all the $V_i$ are the same size, it is clear that equipartitioning each will result in the equipartition we 
seek. Suppose the sizes of the $V_i$ are $s$ and $s - 1$, where $s = q\ell + r$ for $r \in \{0, \ldots, \ell - 1\}$. It suffices to 
show that $\lceil s/\ell \rceil$ and $\lfloor (s-1)/\ell \rfloor$ differ by at most one. 

If $r \ne 0$, then $\lceil s/\ell \rceil = q + 1$ and $\lfloor (s-1)/\ell \rfloor = q$. If $r = 0$, then $\lceil s/\ell \rceil = q$ and $\lfloor (s-1)/\ell \rfloor = q - 1$. □ 
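
The refinement in Proposition 1.2 can be illustrated with a short sketch. It assumes the parts are given as Python lists and that each part is cut into consecutive blocks; the function names `split_evenly`, `refine_equipartition` and `is_equipartition` are hypothetical and chosen only for this illustration.

```python
from typing import List


def split_evenly(part: List[int], ell: int) -> List[List[int]]:
    """Split `part` into `ell` consecutive blocks whose sizes differ by at most one."""
    q, r = divmod(len(part), ell)
    blocks, start = [], 0
    for j in range(ell):
        size = q + 1 if j < r else q  # the first r blocks receive one extra element
        blocks.append(part[start:start + size])
        start += size
    return blocks


def refine_equipartition(parts: List[List[int]], ell: int) -> List[List[int]]:
    """Refine every part into `ell` blocks (assumes ell <= len(part) for each part)."""
    refined = []
    for part in parts:
        refined.extend(split_evenly(part, ell))
    return refined


def is_equipartition(parts: List[List[int]]) -> bool:
    """Check that all part sizes differ pairwise by at most one."""
    sizes = [len(p) for p in parts]
    return max(sizes) - min(sizes) <= 1


# An equipartition of {0, ..., 19} into parts of sizes 7, 7 and 6.
parts = [list(range(0, 7)), list(range(7, 14)), list(range(14, 20))]
refined = refine_equipartition(parts, 3)
```

Here the block sizes are 3 or 2 in every part, which is the situation the proof describes: the largest block has size $\lceil s/\ell \rceil$ and the smallest has size $\lfloor (s-1)/\ell \rfloor$.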
1.1. Graph version. A graph $G$ is a pair $(V, E)$ where $V$ is a finite vertex set and $E \subseteq \binom{V}{2}$. 

For disjoint vertex sets $V_i, V_j$, we denote by $e(V_i, V_j)$ the number of edges with one endpoint in $V_i$ and the 
other in $V_j$. The density of $(V_i, V_j)$ is 

$$d(V_i, V_j) := \frac{e(V_i, V_j)}{|V_i|\,|V_j|}.$$

The density vector of the pair $(V_i, V_j)$ is simply 

We say the pair $(V_i, V_j)$ is a $\gamma$-regular pair if, whenever $V_i' \subseteq V_i$ and $V_j' \subseteq V_j$ satisfy both $|V_i'| \ge \gamma|V_i|$ and 
$|V_j'| \ge \gamma|V_j|$, we have $|d(V_i', V_j') - d(V_i, V_j)| < \gamma$. 

A partition $(V_1, \ldots, V_k)$ of the vertex set of $G$, a graph on $n$ vertices, is said to be a $\gamma$-regular partition 
if each of the following holds: 

• $\big| |V_i| - |V_j| \big| \le 1$ for all $i, j \in \{1, \ldots, k\}$. 

• All but at most $\gamma\binom{k}{2}$ of the pairs $(V_i, V_j)$, $1 \le i < j \le k$, are $\gamma$-regular. 



A version of Szemerédi's lemma says the following: 

Theorem 1.3 (Szemerédi [6]). For every $m$ and $\epsilon > 0$, there exists an integer $M = M(m, \epsilon)$ with the 
following property. 

If $G$ is a graph with $n \ge M$ vertices, and $A$ is an equipartition of the vertex set of $G$ of order at most $m$, 
then there exists a refinement $B$ of $A$ of order $k$, where $m \le k \le M$, which is $\epsilon$-regular. 

There are two important lemmas cited by Alon, et al. [1] which permit discussion of graph embedding. 
They have been presented and reproven many times; we give the statements here. The titles "Slicing lemma" 
and "Embedding lemma" can be found in the literature. 

Lemma 1.4 (Slicing lemma). If $(A, B)$ is a $\gamma$-regular pair with density $\delta$ and $A' \subseteq A$ and $B' \subseteq B$ satisfy 
$|A'| \ge \epsilon|A|$ and $|B'| \ge \epsilon|B|$ for some $\epsilon \ge \gamma$, then $(A', B')$ is a $(\max\{2, \epsilon^{-1}\}\gamma)$-regular pair with density at 
least $\delta - \gamma$ and at most $\delta + \gamma$. 

Lemma 1.5 (Embedding lemma). For every $0 < \eta < 1$ and positive integer $k$ there exist $\gamma = \gamma_{1.5}(\eta, k)$ and 
$\delta = \delta_{1.5}(\eta, k)$ with the following property. 

Suppose that $H$ is a graph with vertices $v_1, \ldots, v_k$, and that $V_1, \ldots, V_k$ is a $k$-tuple of disjoint vertex sets 
such that, for every $1 \le i < i' \le k$, the pair $(V_i, V_{i'})$ is $\gamma$-regular, with density at least $\eta$ if $v_i v_{i'}$ is an edge 
of $H$ and with density at most $1 - \eta$ if $v_i v_{i'}$ is not an edge of $H$. Then, at least $\delta \prod_{i=1}^{k} |V_i|$ of the $k$-tuples 
$w_1 \in V_1, \ldots, w_k \in V_k$ span (induced) copies of $H$ where each $w_i$ plays the role of $v_i$. 

1.2. Multicolor graph version. We call an r-graph on $n$ vertices a pair $(V, c)$ where $V$ is a set of size $n$ 
and $c : \binom{V}{2} \to \{1, \ldots, r\}$ is a function known as the coloring of the edge set. 

For disjoint vertex sets $V_i, V_j$ and a color $p \in \{1, \ldots, r\}$, we denote by $e_p(V_i, V_j)$ the number of edges with 
one endpoint in $V_i$ and the other in $V_j$ and with color $p$. The $p$-density of $(V_i, V_j)$ is 

$$d_p(V_i, V_j) := \frac{e_p(V_i, V_j)}{|V_i|\,|V_j|}.$$

The density vector of the pair $(V_i, V_j)$ is simply 

$$\mathbf{d}(V_i, V_j) := (d_1(V_i, V_j), \ldots, d_r(V_i, V_j)).$$

We say the pair $(V_i, V_j)$ is a $\gamma$-regular pair if, whenever $V_i' \subseteq V_i$ and $V_j' \subseteq V_j$ satisfy both $|V_i'| \ge \gamma|V_i|$ and $|V_j'| \ge 
\gamma|V_j|$, we have $|d_p(V_i', V_j') - d_p(V_i, V_j)| < \gamma$ for each $p \in \{1, \ldots, r\}$. Equivalently, $\|\mathbf{d}(V_i', V_j') - \mathbf{d}(V_i, V_j)\|_\infty < \gamma$. 

A partition $(V_1, \ldots, V_k)$ of the vertex set of $G$, an r-colored graph on $n$ vertices, is said to be a $\gamma$-regular 
partition if each of the following holds: 

• $\big| |V_i| - |V_j| \big| \le 1$ for all $i, j \in \{1, \ldots, k\}$. 

• All but at most $\gamma\binom{k}{2}$ of the pairs $(V_i, V_j)$, $1 \le i < j \le k$, are $\gamma$-regular. 
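
As a concrete illustration of the density vector of a pair in an r-graph, the following sketch computes it directly from a coloring. The representation (a dictionary keyed by `frozenset` pairs) and the helper names `density_vector` and `sup_distance` are assumptions made for this example only.

```python
from itertools import combinations, product


def density_vector(coloring, A, B, r):
    """Return [d_1(A,B), ..., d_r(A,B)] for disjoint vertex lists A and B.

    `coloring` maps frozenset({u, v}) to a color in {1, ..., r}.
    """
    counts = [0] * r
    for u, v in product(A, B):
        counts[coloring[frozenset((u, v))] - 1] += 1
    total = len(A) * len(B)
    return [c / total for c in counts]


def sup_distance(d1, d2):
    """Sup-norm distance between two density vectors."""
    return max(abs(x - y) for x, y in zip(d1, d2))


# A 2-coloring of the complete graph on {0, 1, 2, 3}: color 1 if u + v is even, else 2.
coloring = {frozenset((u, v)): 1 if (u + v) % 2 == 0 else 2
            for u, v in combinations(range(4), 2)}
whole = density_vector(coloring, [0, 1], [2, 3], 2)
sub = density_vector(coloring, [0], [2], 2)
gap = sup_distance(whole, sub)
```

Checking $\gamma$-regularity exactly would require examining every pair of sufficiently large subsets, which is exponential in the part sizes; the sketch only evaluates the quantity that the definition compares.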

The multicolor version of Szemerédi's lemma can be easily derived from a proof outline by Komlós and 
Simonovits [5]: 
Theorem 1.6 (Szemerédi [6]). Fix an integer $r \ge 2$. For every $\epsilon > 0$ and positive integer $m$, there exists 
an integer $CM = CM(m, \epsilon)$ with the following property. 

If $G$ is an r-graph with $n \ge CM$ vertices, and $A$ is an equipartition of the vertex set of $G$ with an order 
not exceeding $m$, then there exists a refinement $B$ of $A$ of order $k$, where $m \le k \le CM$, which is $\epsilon$-regular. 

The classical formulation of Szemerédi's regularity lemma provides only the existence of the $\epsilon$-regular 
partition. However, its proof implies the more precise refinement result we state as Theorem 1.6. In addition, 
the classical formulation of the lemma allows for an exceptional set of size at most $\epsilon n$. We can, however, 
apply the original formulation to the graph $G$ with a smaller parameter than $\epsilon$ and evenly distribute the 
vertices in the exceptional set among the other clusters to get the result with the given value of $\epsilon$. 

Multicolored graphs have their own Slicing and Embedding lemmas: 

Lemma 1.7 (Slicing lemma). If $(A, B)$ is a $\gamma$-regular pair in an r-graph such that $(A, B)$ has density vector 
$(d_1, \ldots, d_r)$ and $A' \subseteq A$ and $B' \subseteq B$ satisfy $|A'| \ge \epsilon|A|$ and $|B'| \ge \epsilon|B|$ for some $\epsilon \ge \gamma$, then $(A', B')$ is a 
$(\max\{2, \epsilon^{-1}\}\gamma)$-regular pair with density vector $\mathbf{d}(A', B')$ such that $|d_p(A, B) - d_p(A', B')| < \gamma$ for each 
$p \in \{1, \ldots, r\}$ (equivalently, $\|\mathbf{d}(A, B) - \mathbf{d}(A', B')\|_\infty < \gamma$). 

Proof. Let $\eta = \max\{2, \epsilon^{-1}\}\gamma$. We may assume $\eta < 1$, otherwise the lemma is trivially true as all pairs are 
$\eta$-regular whenever $\eta \ge 1$. In order to verify the regularity of $(A', B')$, choose $A'' \subseteq A'$ and $B'' \subseteq B'$ such 
that $|A''| \ge \eta|A'|$ and $|B''| \ge \eta|B'|$. Consequently, 

$$|A''| \ge \eta|A'| \ge \eta\epsilon|A| = \max\{2\epsilon, 1\}\gamma|A| \ge \gamma|A|$$

and similarly, $|B''| \ge \gamma|B|$. By the $\gamma$-regularity of $(A, B)$, we know that $(A'', B'')$ has density vector $\mathbf{d}(A'', B'')$ 
such that $\|\mathbf{d}(A, B) - \mathbf{d}(A'', B'')\|_\infty < \gamma$. Moreover, since $|A'| \ge |A''| \ge \gamma|A|$ and $|B'| \ge |B''| \ge \gamma|B|$, we also have 
$\|\mathbf{d}(A, B) - \mathbf{d}(A', B')\|_\infty < \gamma$. By the triangle inequality, 

$$\|\mathbf{d}(A', B') - \mathbf{d}(A'', B'')\|_\infty \le \|\mathbf{d}(A, B) - \mathbf{d}(A', B')\|_\infty + \|\mathbf{d}(A, B) - \mathbf{d}(A'', B'')\|_\infty < 2\gamma \le \eta.$$

The arbitrary choice of $A''$ and $B''$ means that $(A', B')$ is $\eta$-regular. □ 

Lemma 1.8 (Embedding lemma). For every $0 < \eta < 1$ and positive integer $k$ there exist $\gamma = \gamma_{1.8}(\eta, k)$ and 
$\delta = \delta_{1.8}(\eta, k)$ with the following property. 

Fix an integer $r \ge 2$. Suppose that $H = (\{v_1, \ldots, v_k\}, c)$ is an r-graph. Let $G$ be an r-graph. Let $V_1, \ldots, V_k$ 
be a $k$-tuple of disjoint vertex sets of $G$ such that for every $1 \le i < i' \le k$ the pair $(V_i, V_{i'})$ is $\gamma$-regular, such 
that the density $d_p(V_i, V_{i'}) \ge \eta$ if $v_i v_{i'}$ is an edge of $H$ with color $p$, for each $p$, $1 \le p \le r$. Then, at least 
$\delta \prod_{i=1}^{k} |V_i|$ of the $k$-tuples $(w_1, \ldots, w_k)$ with $w_1 \in V_1, \ldots, w_k \in V_k$ span copies of $H$ where each $w_i$ plays the 
role of $v_i$. 

Note that the case of $r = 2$ is the case of induced graphs in which edges are color 1 and nonedges are color 2. 

Proof. We note that $r$ plays no role at all in the definitions of $\gamma$ and $\delta$. This is because $\eta$ is the parameter 
that ensures the proper density for all colors. We will choose $\gamma_{1.8}(\eta, k) = \min\{(\eta/2)^{k-1}, (1/6)^{k-1}\}$. 

We proceed via induction on $k$ to determine the value of $\delta_{1.8}(\eta, k)$. The case of $k = 1$ is trivial and 
$\delta_{1.8}(\eta, 1) = 1$ for all $\eta$. Let $k \ge 2$ and suppose there is such a function $\delta_{1.8}(\cdot, k - 1)$. Let 

(1) $\gamma = \min\{(\eta/2)^{k-1}, (1/6)^{k-1}\}$. 

Consider $V_k$. For $i \in \{1, \ldots, k - 1\}$, call a vertex $w_k \in V_k$ bad with respect to $V_i$ if $w_k$ has fewer than $(\eta - \gamma)|V_i|$ edges 
of color $p = c(v_k, v_i)$ incident to it with the other endpoint in $V_i$. 

Fix $i$, assume that more than $\gamma|V_k|$ vertices in $V_k$ are bad with respect to $V_i$ and let $V_k'$ be the set of these vertices. Then, 
$d_p(V_k', V_i) < \frac{(\eta - \gamma)|V_i|\,|V_k'|}{|V_i|\,|V_k'|} = \eta - \gamma$. On the other hand $d_p(V_k, V_i) \ge \eta$. So $|d_p(V_k', V_i) - d_p(V_k, V_i)| > \gamma$, contradicting the 
fact that $(V_k, V_i)$ is $\gamma$-regular. 

Thus, the number of vertices that are bad with respect to a fixed $V_i$ is at most $\gamma|V_k|$. Therefore, there are at most $(k - 1)\gamma|V_k| < |V_k|$ vertices 
that are bad with respect to some $V_i$, $1 \le i \le k - 1$. Let $w_k \in V_k$ be a vertex that is not bad with respect 
to any $V_i$. Let $V_i' \subseteq V_i$ be a set of $\lceil (\eta - \gamma)|V_i| \rceil$ vertices $w_i$ such that $w_i w_k$ has the correct color; i.e., the 
color of $v_i v_k$. 

By the Slicing Lemma, each pair $(V_i', V_{i'}')$ for $1 \le i < i' \le k - 1$ is $(\max\{2, (\eta - \gamma)^{-1}\}\gamma)$-regular. The 
pairs also have that $d_p(V_i', V_{i'}') \ge \eta - \gamma$ if $v_i v_{i'}$ is an edge of $H$ with color $p$. 
In order to apply the inductive hypothesis, we must verify that 

(2) $\max\{2, (\eta - \gamma)^{-1}\}\gamma \le \gamma_{1.8}(\eta - \gamma, k - 1) = \min\left\{\left(\frac{\eta - \gamma}{2}\right)^{k-2}, \left(\frac{1}{6}\right)^{k-2}\right\}$. 

If $\eta - \gamma \ge 1/2$, then (1) gives that $\gamma = (1/6)^{k-1}$ and (2) reduces to $2\gamma \le (1/6)^{k-2}$, which is true for all $k$. 

If $\eta - \gamma < 1/2$ and $\eta - \gamma \ge 1/3$, then (1) gives that $\gamma \le (1/6)^{k-1}$ and (2) reduces to 

$$\frac{\gamma}{\eta - \gamma} \le \left(\frac{1}{6}\right)^{k-2}.$$

This is true because $\gamma/(\eta - \gamma) \le 3\gamma \le 3(1/6)^{k-1} = \frac{1}{2}(1/6)^{k-2}$. 

If $\eta - \gamma < 1/3$, since $\gamma \le (\eta/2)^{k-1}$, (2) reduces to 

$$\frac{\gamma}{\eta - \gamma} \le \left(\frac{\eta - \gamma}{2}\right)^{k-2}.$$

To verify this, see that 

$$2^{k-2}\gamma \le 2^{k-2}(\eta/2)^{k-1} = \tfrac{1}{2}\eta^{k-1}$$

and that 

$$\eta - \gamma \ge \eta - (\eta/2)^{k-1} \ge \eta\left(1 - 2^{1-k}\right).$$

Some calculus shows that $(1 - 2^{-x})^x$ is increasing for $x \ge 1$ and so we have 

$$(\eta - \gamma)^{k-1} \ge \eta^{k-1}\left(1 - 2^{1-k}\right)^{k-1} \ge \tfrac{1}{2}\eta^{k-1},$$

as needed. 

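Now that we have verified that we can use the inductive hypothesis, we do so and see that the number 
of copies of $H - v_k$ in $(V_1', \ldots, V_{k-1}')$ is at least $\delta_{1.8}(\eta - \gamma, k - 1)\prod_{i=1}^{k-1} |V_i'|$. Hence the total number of copies of 
$H$ is at least 

$$\delta_{1.8}(\eta - \gamma, k - 1)\left(\prod_{i=1}^{k-1} |V_i'|\right)(1 - (k - 1)\gamma)\,|V_k| \;\ge\; \delta_{1.8}(\eta - \gamma, k - 1)\,(\eta - \gamma)^{k-1}\,(1 - (k - 1)\gamma)\prod_{i=1}^{k} |V_i|.$$

With $\gamma = \min\{(\eta/2)^{k-1}, (1/6)^{k-1}\}$, set $\delta_{1.8}(\eta, k) = \delta_{1.8}(\eta - \gamma, k - 1)(\eta - \gamma)^{k-1}(1 - (k - 1)\gamma)$, and the 
conditions of the Embedding Lemma are satisfied. □ 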
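
The proof defines the two parameters recursively: $\gamma(\eta, k) = \min\{(\eta/2)^{k-1}, (1/6)^{k-1}\}$, $\delta(\eta, 1) = 1$, and $\delta(\eta, k) = \delta(\eta - \gamma, k - 1)(\eta - \gamma)^{k-1}(1 - (k - 1)\gamma)$. The sketch below evaluates this recursion with exact rational arithmetic; the function names `gamma` and `delta` are chosen for illustration, and the inputs are assumed to be `Fraction` values with $0 < \eta < 1$.

```python
from fractions import Fraction


def gamma(eta: Fraction, k: int) -> Fraction:
    """Regularity parameter: min{(eta/2)^(k-1), (1/6)^(k-1)}."""
    return min((eta / 2) ** (k - 1), Fraction(1, 6) ** (k - 1))


def delta(eta: Fraction, k: int) -> Fraction:
    """Fraction of k-tuples guaranteed by the recursion in the proof."""
    if k == 1:
        return Fraction(1)
    g = gamma(eta, k)
    # One vertex of V_k is chosen among the non-bad ones, then induct on k - 1.
    return delta(eta - g, k - 1) * (eta - g) ** (k - 1) * (1 - (k - 1) * g)


example_gamma = gamma(Fraction(1, 2), 2)
example_delta = delta(Fraction(1, 2), 2)
```

For $\eta = 1/2$ and $k = 2$ this gives $\gamma = 1/6$ and $\delta = (1/3)(5/6) = 5/18$. The values shrink rapidly with $k$, which is typical of bounds obtained this way.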

1.3. Directed graph version. A digraph is defined to be a pair $(V, E)$ where $V$ is a labeled vertex set, 
$E \subseteq (V)_2$ and $(V)_2$ denotes the set $V \times V - \{(v, v) : v \in V\}$. It is convenient for us to view this as a 
coloring. That is, a digraph is a pair $(V, c)$ where $c : (V)_2 \to \{\circ, -, \leftarrow, \rightarrow\}$ is a function known as the 
partial orientation of the edge set. It has the property that, for distinct $v, w$, 

• $c(v, w) = c(w, v)$ if and only if $c(v, w) \in \{\circ, -\}$ and 

• $c(v, w) = \rightarrow$ if and only if $c(w, v) = \leftarrow$. 

For convenience, we denote $A = \{\circ, -, \leftarrow, \rightarrow\}$. Here we interpret the color $c(v, w) = \circ$ to mean that 
neither $(v, w)$ nor $(w, v)$ are in $E$, the color $c(v, w) = -$ to mean that both $(v, w)$ and $(w, v)$ are in $E$ and 
the color $c(v, w) = \rightarrow$ to mean that $(v, w) \in E$ and $(w, v) \notin E$. 

In the directed case, we have the same notions of $\gamma$-regular pairs as in the multicolor case. The density 
vector of the pair $(V_i, V_j)$ is somewhat similar as well: 

$$\mathbf{d}(V_i, V_j) := (d_\circ(V_i, V_j), d_-(V_i, V_j), d_\leftarrow(V_i, V_j), d_\rightarrow(V_i, V_j)).$$

However, in the directed case, the order makes a difference. We have $d_p(A, B) = d_p(B, A)$ for $p \in \{\circ, -\}$, 
whereas $d_\leftarrow(A, B) = d_\rightarrow(B, A)$. 



Alon and Shapira give the following version of Szemerédi's lemma: 

Theorem 1.9 (Alon-Shapira [2]). For every $\epsilon > 0$ and positive integer $m$, there exists an integer $DM = 
DM(m, \epsilon)$ with the following property. 

If $G$ is a digraph with $n \ge DM$ vertices, and $A$ is an equipartition of the vertex set of $G$ with an order not 
exceeding $m$, then there exists a refinement $B$ of $A$ of order $k$, where $m \le k \le DM$, which is $\epsilon$-regular. 

Digraphs have their own Slicing and Embedding lemmas: 

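Lemma 1.10 (Slicing lemma). If $(A, B)$ is a $\gamma$-regular pair in a digraph such that $(A, B)$ has density vector 
$(d_\circ, d_-, d_\leftarrow, d_\rightarrow)$ and $A' \subseteq A$ and $B' \subseteq B$ satisfy $|A'| \ge \epsilon|A|$ and $|B'| \ge \epsilon|B|$ for some $\epsilon \ge \gamma$, then $(A', B')$ 
is a $(\max\{2, \epsilon^{-1}\}\gamma)$-regular pair with density vector $\mathbf{d}' := (d'_\circ, d'_-, d'_\leftarrow, d'_\rightarrow)$ such that $|d_p - d'_p| < \gamma$ for each 
$p \in \{\circ, -, \leftarrow, \rightarrow\}$ (equivalently, $\|\mathbf{d}(A, B) - \mathbf{d}(A', B')\|_\infty < \gamma$). 
The proof is identical to the multicolor case, Lemma 1.7. 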

Lemma 1.11 (Embedding lemma). For every $0 < \eta < 1$ and positive integer $k$ there exist $\gamma = \gamma_{1.11}(\eta, k)$ 
and $\delta = \delta_{1.11}(\eta, k)$ with the following property. 

Suppose that $H$ is a digraph with vertices $v_1, \ldots, v_k$, and that $V_1, \ldots, V_k$ is a $k$-tuple of disjoint vertex sets 
of $G$ such that for every $1 \le i < i' \le k$ the pair $(V_i, V_{i'})$ is $\gamma$-regular, such that the density $d_p(V_i, V_{i'}) \ge \eta$ 
if $(v_i, v_{i'})$ is an edge of $H$ with color $p$. Then, at least $\delta \prod_{i=1}^{k} |V_i|$ of the $k$-tuples $(w_1, \ldots, w_k)$ with $w_1 \in 
V_1, \ldots, w_k \in V_k$ span (induced) copies of $H$ where each $w_i$ plays the role of $v_i$. 

Again, the proof is identical to the multicolor case, Lemma 1.8. 

1.4. Main results. The statement of the main result (Theorem 1.12) can be made in general with the 
definitions above. Recall that $\mathbf{d}(V_i, V_{i'})$ denotes the density vector of the pair $(V_i, V_{i'})$. 

Theorem 1.12 (Alon, et al. [1]). Fix $r \ge 2$. For every $m$ and function $\mathcal{E}$ with $\mathcal{E} : \mathbb{N} \to (0, 1)$, there exist 
$S = S_{1.12}(r, m, \mathcal{E})$ and $\delta = \delta_{1.12}(r, m, \mathcal{E})$ with the following property: 

If $G$ is a graph [r-graph, digraph] with $n \ge S$ vertices then there exist an equipartition $A = \{V_i : 1 \le i \le 
k\}$ of $G$ and an induced subgraph [induced r-subgraph, induced subdigraph] $G'$ of $G$, with an equipartition 
$A' = \{V_i' : 1 \le i \le k\}$ of the vertices of $G'$ that satisfy: 

• $S \ge k \ge m$. 

• $V_i' \subseteq V_i$ for all $i \ge 1$, and $|V_i'| \ge \delta n$. 

• In the equipartition $A'$, all pairs are $\mathcal{E}(k)$-regular. 

• All but at most $\mathcal{E}(0)\binom{k}{2}$ of the pairs $1 \le i < i' \le k$ are such that $\|\mathbf{d}(V_i, V_{i'}) - \mathbf{d}(V_i', V_{i'}')\|_\infty < \mathcal{E}(0)$. 

Our contribution is to prove the case for multicolored graphs and digraphs. Although the proof is quite 
similar to that of N. Alon, E. Fischer, M. Krivelevich and M. Szegedy [1], there are subtleties that need to 
be addressed. 

2. Proof of the main results 

Several lemmas are required to prove our main result. Lemma 2.2 is a consequence 
of the defect form of the Cauchy-Schwarz Inequality, which is stated without proof and can be found in [6]. 
Corollary 2.3 is a direct consequence of Lemma 2.2. Lemma 2.4 is a refinement lemma that allows the 
induction to take place and Lemma 2.5 is the main lemma, of which our main result, Theorem 1.12, is a 
direct consequence. 

First, we need a definition which, in the context of multicolorings of the complete graph, comes from [5]. 

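Definition 2.1. Given an equipartition $A = \{V_i : 1 \le i \le k\}$ of the vertex set of a multicolored graph 
[digraph], we define the index of $A$ as follows: 

$$\mathrm{ind}(A) = \frac{1}{k^2} \sum_{p} \sum_{1 \le i < i' \le k} d_p^2(V_i, V_{i'}),$$

where in the case of multicolored graphs, $p$ runs over all colors and in the case of digraphs, the colors $p$ 
run over the set of four "colors" in the set $A = \{\circ, -, \leftarrow, \rightarrow\}$. 

Note also that $\mathrm{ind}(A) = \frac{1}{k^2} \sum_{1 \le i < i' \le k} \sum_p d_p^2(V_i, V_{i'}) \le \frac{1}{k^2} \sum_{1 \le i < i' \le k} \left(\sum_p d_p(V_i, V_{i'})\right)^2 \le \frac{1}{2}$. 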
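
To make the index concrete, the sketch below evaluates it for a small 2-colored complete graph. It assumes the normalization $\mathrm{ind}(A) = k^{-2}\sum_p \sum_{i<i'} d_p^2(V_i, V_{i'})$ used by Komlós and Simonovits; the data layout and the names `color_density` and `index` are illustrative choices, not taken from the paper.

```python
from itertools import combinations, product


def color_density(coloring, A, B, p):
    """Density of color p between disjoint vertex lists A and B."""
    hits = sum(1 for u, v in product(A, B) if coloring[frozenset((u, v))] == p)
    return hits / (len(A) * len(B))


def index(coloring, parts, r):
    """Index of a partition: (1/k^2) * sum over colors and pairs of squared densities."""
    k = len(parts)
    total = 0.0
    for A, B in combinations(parts, 2):
        for p in range(1, r + 1):
            total += color_density(coloring, A, B, p) ** 2
    return total / k ** 2


# Complete graph on {0, 1, 2, 3}: color 1 if u + v is even, color 2 otherwise.
coloring = {frozenset((u, v)): 1 if (u + v) % 2 == 0 else 2
            for u, v in combinations(range(4), 2)}
coarse = index(coloring, [[0, 1], [2, 3]], 2)
fine = index(coloring, [[0], [1], [2], [3]], 2)
```

In this example the coarse partition has index $0.125$ and the partition into singletons has index $0.375$; both stay below $1/2$, in line with the bound noted after the definition.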
Lemma 2.2. For all sequences of nonnegative numbers $X_1, \ldots, X_n$, if for some $m$, $1 \le m < n$, 

$$\sum_{k=1}^{m} X_k = \frac{m}{n} \sum_{k=1}^{n} X_k + a,$$

then 

$$\sum_{k=1}^{n} X_k^2 \ge \frac{1}{n}\left(\sum_{k=1}^{n} X_k\right)^2 + \frac{a^2 n}{m(n - m)}$$

(observe that $a$ need not be positive). 
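
The defect form of the Cauchy-Schwarz inequality is easy to test numerically. The sketch below takes the statement to be: if $\sum_{k \le m} X_k = \frac{m}{n}\sum_{k \le n} X_k + a$, then $\sum_k X_k^2 \ge \frac{1}{n}(\sum_k X_k)^2 + \frac{a^2 n}{m(n-m)}$. The helper name `defect_gap` is hypothetical; it returns the left side minus the right side, which should be nonnegative up to rounding.

```python
import random


def defect_gap(xs, m):
    """Left-hand side minus right-hand side of the defect Cauchy-Schwarz inequality.

    `xs` is a list of nonnegative numbers and 1 <= m < len(xs).
    """
    n = len(xs)
    total = sum(xs)
    a = sum(xs[:m]) - (m / n) * total  # deviation of the first m terms from their share
    lhs = sum(x * x for x in xs)
    rhs = total ** 2 / n + a * a * n / (m * (n - m))
    return lhs - rhs


# Equality holds when the sequence is constant on the first m and on the last n - m terms.
tight = defect_gap([2.0, 2.0, 0.0, 0.0], 2)

random.seed(0)
gaps = []
for _ in range(200):
    n = random.randint(2, 12)
    m = random.randint(1, n - 1)
    xs = [random.random() for _ in range(n)]
    gaps.append(defect_gap(xs, m))
```

The inequality follows from applying the ordinary Cauchy-Schwarz inequality separately to the first $m$ terms and to the remaining $n - m$ terms, which is why the two-level sequence in the example is tight.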

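Corollary 2.3. Suppose that $A$ and $B$ are two disjoint sets of vertices of a multicolored graph [digraph] $G$, 
and $\{A_j : 1 \le j \le \ell\}$ and $\{B_j : 1 \le j \le \ell\}$ are their two respective partitions to sets of equal sizes, such 
that, for some color $p$, at least $\epsilon\ell^2$ of the possible $j, j'$ satisfy $|d_p(A, B) - d_p(A_j, B_{j'})| \ge \frac{1}{2}\epsilon$. Then, 

$$\sum_{1 \le j, j' \le \ell} d_p^2(A_j, B_{j'}) \ge \ell^2 d_p^2(A, B) + \frac{\epsilon^3 \ell^2}{8}.$$

Proof of Corollary 2.3. Under the above conditions, either at least $\frac{1}{2}\epsilon\ell^2$ of the pairs $j, j'$ are such that 
$d_p(A_j, B_{j'}) - d_p(A, B) \ge \frac{1}{2}\epsilon$, or at least $\frac{1}{2}\epsilon\ell^2$ are such that $d_p(A_j, B_{j'}) - d_p(A, B) \le -\frac{1}{2}\epsilon$. We use Lemma 2.2 
with $n = \ell^2$, $m = \frac{1}{2}\epsilon\ell^2$, and $a$ satisfying $|a| \ge \frac{1}{4}\epsilon^2\ell^2$. Furthermore, we use the fact that all $|A_j| = |A|/\ell$ and 
all $|B_{j'}| = |B|/\ell$ to obtain 

$$\sum_{1 \le j, j' \le \ell} d_p(A_j, B_{j'}) = \ell^2 d_p(A, B).$$

Applying Lemma 2.2 to the sequence $\{d_p(A_j, B_{j'})\}_{1 \le j, j' \le \ell}$, we obtain 

$$\sum_{1 \le j, j' \le \ell} d_p^2(A_j, B_{j'}) \ge \ell^2 d_p^2(A, B) + \frac{a^2 \ell^2}{\frac{1}{2}\epsilon\ell^2\left(1 - \frac{1}{2}\epsilon\right)\ell^2} \ge \ell^2 d_p^2(A, B) + \frac{\epsilon^3 \ell^2}{8},$$

as required. □ 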



Lemma 2.4. Suppose that $A = \{V_i : 1 \le i \le k\}$ and its refinement $B = \{V_{i,j} : 1 \le i \le k, 1 \le j \le \ell\}$ are vertex 
partitions of a graph $G$, satisfying $\mathrm{ind}(B) - \mathrm{ind}(A) \le \frac{1}{64} r\epsilon^4$ for some $\epsilon$, and that the number of vertices of the 
graph is $n \ge 512\epsilon^{-4} r k \ell$. Then, for all possible $i < i'$ but at most $\epsilon\binom{k}{2}$ of them, $|d_p(V_i, V_{i'}) - d_p(V_{i,j}, V_{i',j'})| < \epsilon$ 
holds simultaneously for all $p$, for all but a maximum of $\epsilon\ell^2$ of the possible $j, j'$. 

Proof of Lemma 2.4. Supposing the contrary and assuming $\epsilon < 1$ and $k > 1$, we show that the index of 
$B$ is larger than that of $A$ by more than $\frac{1}{64} r\epsilon^4$. If not all of the sets of $B$ are of exactly the same size, let $V'_{i,j}$ 
be $V_{i,j}$ for sets of the smaller size and $V'_{i,j}$ be $V_{i,j}$ minus an arbitrarily chosen vertex for sets of the larger 
size. Defining also $V'_i = \bigcup_{1 \le j \le \ell} V'_{i,j}$, define two new partitions $B' = \{V'_{i,j} : 1 \le i \le k, 1 \le j \le \ell\}$ and 
$A' = \{V'_i : 1 \le i \le k\}$ of a large induced multicolored subgraph [subdigraph] of $G$ (for each of these new 
partitions all its sets are of the same size) . The assumption on n implies that \dp{Vi,Vi')-'dp{V/ ,V-,)\ < ^ e** 
and \dp(yij,Viij') — dp{Vlj,V-, ,i,)\ < -^e^ hold for all i,j,i',j',p. In particular, |ind(^) - ind(^')| < 
Y^e^ and |ind(6) — ind(;B')| < ^e-^ hold, and for more than e(^) of the possible i < i' , the inequality 
\dpiyi, VI,) - dp{Vlj,V[ j,)\ > e - > \e holds for at least of the possible j, j'. Using Corollary O 

we obtain 

P 1 <i < i' < k 

1 < jJ' < e 

p y i<i<i'<k ^ ^ J 

This implies ind(i3) — ind(^) > ind(B') — ind(^') — ■^re'^ > -^re^, completing the proof. □ 
The main lemma is Lemma 2.5. 

Lemma 2.5. Fix a positive integer $r$. For every integer $m$ and function $\mathcal{E}$ with $\mathcal{E} : \mathbb{N} \to (0, 1)$, there exists 
a number $S = S_{2.5}(r, m, \mathcal{E})$ with the following property. 

If $G$ is an r-graph [digraph] with $n \ge S$ vertices, then there exists an equipartition $A = \{V_i : 1 \le i \le k\}$ 
and a refinement $B = \{V_{i,j} : 1 \le i \le k, 1 \le j \le \ell\}$ of $A$ that satisfy: 

• $|A| = k \ge m$ but $|B| = k\ell \le S$. 

• For all $1 \le i < i' \le k$ but at most $\mathcal{E}(0)\binom{k}{2}$ of them, the pair $(V_i, V_{i'})$ is $\mathcal{E}(0)$-regular. 

• For all $1 \le i < i' \le k$ and all $1 \le j, j' \le \ell$ but at most $\mathcal{E}(k)\ell^2$ of them, the pair $(V_{i,j}, V_{i',j'})$ is 
$\mathcal{E}(k)$-regular. 

• All $1 \le i < i' \le k$ but at most $\mathcal{E}(0)\binom{k}{2}$ of them are such that for all $1 \le j, j' \le \ell$ but at most $\mathcal{E}(0)\ell^2$ 
of them $|d_p(V_i, V_{i'}) - d_p(V_{i,j}, V_{i',j'})| < \mathcal{E}(0)$ holds for each $p \in \{1, \ldots, r\}$. 

Proof. We may assume that $m > 1$ and that $\mathcal{E}(k)$ is monotone nonincreasing. For convenience, let $\epsilon = \mathcal{E}(0)$. 

If we are in the case of a multicolored graph, fix a positive integer $r$, and using the function $CM$ from 
Theorem 1.6, let 

$$T^{(1)} = CM(r, m, \epsilon)$$

and for $i > 1$, we define by induction 

$$T^{(i)} = CM\left(r, T^{(i-1)}, 2\mathcal{E}(T^{(i-1)})(T^{(i-1)})^{-2}\right).$$

If we are in the case of a digraph, and using the function $DM$ from Theorem 1.9, let 

$$T^{(1)} = DM(m, \epsilon)$$

and for $i > 1$, we define by induction 

$$T^{(i)} = DM\left(T^{(i-1)}, 2\mathcal{E}(T^{(i-1)})(T^{(i-1)})^{-2}\right).$$

In either case, we show that $S = 512 r\epsilon^{-4} T^{(64 r^{-1}\epsilon^{-4} + 1)}$ satisfies the required property. 

Given $G$, define $A_1$ to be an equipartition of order at least $m$ but not greater than $T^{(1)}$, such that all 
pairs but at most $\epsilon\binom{|A_1|}{2}$ of them are $\epsilon$-regular. Define by induction for $i > 1$ the equipartition $A_i$ to be a 
refinement of $A_{i-1}$, of order not greater than $T^{(i)}$, such that all of the pairs but at most 

$$2\mathcal{E}(T^{(i-1)})(T^{(i-1)})^{-2}\binom{|A_i|}{2} \le 2\mathcal{E}(T^{(i-1)})(|A_{i-1}|)^{-2}\binom{|A_i|}{2}$$

are $2\mathcal{E}(T^{(i-1)})(T^{(i-1)})^{-2} \le \mathcal{E}(T^{(i-1)})$-regular. The refinements are guaranteed by the original regularity 
lemma, either Theorem 1.6 (in the multicolor case) or Theorem 1.9 (in the digraph case). 

Let us now choose the minimum $i$ such that $\mathrm{ind}(A_i) - \mathrm{ind}(A_{i-1}) \le \frac{1}{64} r\epsilon^4$. There certainly exists such 
an $i$ with $1 < i \le 64 r^{-1}\epsilon^{-4} + 1$, since the indices of each partition in the series are all between 0 and 1. We set 
$A = A_{i-1}$ and $B = A_i$, and appropriately $k = |A_{i-1}| = |A|$ and $\ell = k^{-1}|A_i| = k^{-1}|B|$. We claim that $A$ 
and $B$ are the required partitions. 

It is clear that $B$ is a refinement of $A$ and that they both satisfy the requirements with regards to their 
respective orders. It is also clear (by the assumption $\mathcal{E}(k) \le \mathcal{E}(0) = \epsilon$) that $A$ satisfies the requirement 
regarding the regularity of its pairs. Since all but at most $2\mathcal{E}(k)k^{-2}\binom{k\ell}{2} < \mathcal{E}(k)\ell^2$ of all the pairs of $B$ are 
$\mathcal{E}(k)$-regular, the condition regarding the regularity of pairs of $B$ in the formulation of the lemma follows. 
Finally, Lemma 2.4 shows that most densities of the pairs of $B$ differ from the corresponding densities of the 
pairs of $A$ by less than $\epsilon$, as in the formulation of the last condition of this lemma. □ 

Proof of Theorem [1TT21 We may assume £{n) < £{0). Set e = £{0). Define £' by setting £'{k) 
= min|£:(K), je,^Ct^y^}^ set S = ^^r,m,£') and 5 = ^S^m, £'))-'^ . Use Lemma[23]on G, finding 
the appropriate partitions A = {Vi : 1 < i < k} and B = {Vij : 1 < i < k,l < j < £}. 

Now choose randomly, independently and uniformly $j_i$ such that $1 \le j_i \le \ell$ for each $1 \le i \le k$. With 
probability more than 1/2, all the pairs $(V_{i,j_i}, V_{i',j_{i'}})$ are $\mathcal{E}(k)$-regular. In fact, the probability that there is 
some pair that is not $\mathcal{E}(k)$-regular is at most $\mathcal{E}'(k)\binom{k}{2}$. 
Moreover, the expected number of pairs $1 \le i < i' \le k$ for which $|d_p(V_i, V_{i'}) - d_p(V_{i,j_i}, V_{i',j_{i'}})| \ge \epsilon$ for 
some $p$ is no more than $\frac{1}{4}\epsilon\binom{k}{2} + \frac{1}{4}\epsilon\binom{k}{2} = \frac{1}{2}\epsilon\binom{k}{2}$, by the choice of $\mathcal{E}'$, so with probability at least 1/2, no more 
than $\epsilon\binom{k}{2}$ of the pairs satisfy this. 

Therefore, there exists a choice of $j_1, \ldots, j_k$ such that all pairs $(V_{i,j_i}, V_{i',j_{i'}})$ are $\mathcal{E}(k)$-regular, and all but 
at most $\epsilon\binom{k}{2}$ of them satisfy $|d_p(V_i, V_{i'}) - d_p(V_{i,j_i}, V_{i',j_{i'}})| < \epsilon$ for all $p \in \{1, \ldots, r\}$. Defining $G'$ as the induced 
subgraph spanned by $\bigcup_{1 \le i \le k} V_{i,j_i}$, and setting $V_i' = V_{i,j_i}$, achieves the required result. □ 

3. Application 

An important feature of editing is the notion of the palette. Colloquially, the palette is the set of colors 
to which an edge can be changed. For an r-graph, the palette is always the set $\{1, \ldots, r\}$. Note that if 
$r = 2$, this is the case of simple graphs. So, we will not define the palette for r-graphs, only focusing on it 
for digraphs. 

Definition 3.1. In the case of digraphs, we say that $\mathcal{P} \subseteq A$ is a palette if either none or both of "$\leftarrow$" and 
"$\rightarrow$" are in $\mathcal{P}$ and every digraph is a pair $(V, c)$ where $V$ is a vertex set and $c : (V)_2 \to \mathcal{P}$ is a coloring of 
the edge set of a complete graph on $|V|$ vertices. There are 5 possible nontrivial palettes: 

(0) $\mathcal{P}_0 = A$ is the most general case. 

(1) $\mathcal{P}_1 = \{-, \leftarrow, \rightarrow\}$ is the case of simple digraphs such that every pair of vertices has at least one arc 
between them. 

(2) $\mathcal{P}_2 = \{\circ, \leftarrow, \rightarrow\}$ is the case of oriented graphs; that is, no pair of vertices has two arcs between 
them. 

(3) $\mathcal{P}_3 = \{\circ, -\}$ is the case of simple, undirected graphs. 

(4) $\mathcal{P}_4 = \{\leftarrow, \rightarrow\}$ is the case of tournaments. 

Recall that the density vector is of the form $(p, q)$ where $p, q \ge 0$ and $0 \le 1 - p - 2q$. In the cases in which the 
palette is not $A$, the relevant density vector must be further restricted. 

(1) In the case of $\mathcal{P}_1 = \{-, \leftarrow, \rightarrow\}$, then $p + 2q = 1$. 

(2) In the case of $\mathcal{P}_2 = \{\circ, \leftarrow, \rightarrow\}$, then $p = 0$ and $q \le 1/2$. 

(3) In the case of $\mathcal{P}_3 = \{\circ, -\}$, then $q = 0$ and $p \le 1$. This is the r-graph case where $r = 2$ or simply 
the case of undirected graphs. See [3] and [4]. 

(4) In the case of $\mathcal{P}_4 = \{\leftarrow, \rightarrow\}$, then $p = 0$ and $1 - p - 2q = 0$, so $q = 1/2$. 
Our application is one of edit distance and it shows that r-types [dir-types] are used to lower bound the 
edit distance function. It turns out that, trivially, they upper bound the edit distance function. 

Definition 3.2. An r-type, $K$, is a pair $(U, \phi)$, where $U$ is a finite set of vertices and $\phi : U \times U \to 2^{\{1, \ldots, r\}} \setminus \emptyset$, 
such that $\phi(x, y) = \phi(y, x)$ and $\phi(x, x) \ne \{1, \ldots, r\}$, for all $x, y \in U$. Informally, we will view an r-type as a 
complete graph with a coloring of both vertices and edges using subsets of $\{1, \ldots, r\}$. The sub-r-type of $K$ 
induced by $W \subseteq U$ is the r-type achieved by deleting the vertices $U - W$ from $K$. 

We say that an r-graph $H = (V, c)$ embeds in type $K = (U, \phi)$, and write $H \mapsto K$, 
if there is a map $\gamma : V \to U$ such that $c(\{v, v'\}) = c_0$ implies $c_0 \in \phi(\gamma(v), \gamma(v'))$. 
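
For small instances, embeddability of an r-graph in an r-type can be decided by exhaustive search over all maps $\gamma : V \to U$. The sketch below does exactly that; it is exponential in $|V|$ and meant only to illustrate the definition. The data layout (colorings keyed by `frozenset` pairs, $\phi$ keyed by ordered pairs of type vertices) and the function name `embeds` are assumptions made for this example.

```python
from itertools import combinations, product


def embeds(vertices, coloring, type_vertices, phi):
    """Return True if some map gamma: V -> U satisfies
    coloring({v, w}) in phi(gamma(v), gamma(w)) for every pair v != w.

    `phi[(u, u)]` gives the colors allowed inside the class mapped to u.
    """
    for image in product(type_vertices, repeat=len(vertices)):
        gamma = dict(zip(vertices, image))
        if all(coloring[frozenset((v, w))] in phi[(gamma[v], gamma[w])]
               for v, w in combinations(vertices, 2)):
            return True
    return False


# r = 2. Type K1: a single vertex u whose class may only contain color 1.
K1 = (["u"], {("u", "u"): {1}})
# Type K2: two classes with color 1 inside each class and color 2 between them.
K2 = (["u", "w"], {("u", "u"): {1}, ("w", "w"): {1},
                   ("u", "w"): {2}, ("w", "u"): {2}})

V = [0, 1, 2]
mono = {frozenset(e): 1 for e in combinations(V, 2)}          # all pairs color 1
mixed = {frozenset((0, 1)): 2, frozenset((0, 2)): 2, frozenset((1, 2)): 1}

mono_in_K1 = embeds(V, mono, *K1)
mixed_in_K1 = embeds(V, mixed, *K1)
mixed_in_K2 = embeds(V, mixed, *K2)
```

In the example, the mixed coloring does not embed in `K1` but does embed in `K2` by sending vertex 0 to one class and vertices 1 and 2 to the other.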

Types are defined in a slightly different way for digraphs. 

Definition 3.3. Let $\mathcal{P} \subseteq A$ be a palette. A $\mathcal{P}$-dir-type, or simply dir-type where the palette is understood, 
$K$, is a pair $(U, \phi)$, where $U$ is a finite set of vertices and $\phi : U \times U \to 2^{\mathcal{P}} \setminus \emptyset$, such that 

(1) for distinct $x, y$ and $p \in \{\circ, -\}$, $\phi(x, y) \ni p$ if and only if $\phi(y, x) \ni p$ and 

(2) $\phi(x, y) \ni \rightarrow$ if and only if $\phi(y, x) \ni \leftarrow$. 

Moreover, for all $x \in U$, $\phi(x, x)$ is a nonempty proper subset of $\mathcal{P}$. The sub-dir-type of $K$ induced by 
$W \subseteq U$ is the dir-type achieved by deleting the vertices $U - W$ from $K$. 

We say that a directed graph $H = (V, c)$ embeds in type $K = (U, \phi)$, and write $H \mapsto K$, if there is 
a map $\gamma : V \to U$ such that, for distinct $u, u' \in U$, $c(v, v') \in \phi(u, u')$ whenever $\gamma(v) = u$ and $\gamma(v') = u'$ 
and for $u \in U$, the following occurs: (1) if exactly one of $\{\leftarrow, \rightarrow\}$ is in $\phi(u, u)$, then the oriented edges of 
$\gamma^{-1}(u)$ are a subdigraph of a transitive tournament, (2) if neither $\leftarrow$ nor $\rightarrow$ is in $\phi(u, u)$, then $\gamma^{-1}(u)$ has 
no oriented edges, (3) if $\circ \notin \phi(u, u)$, then $\gamma^{-1}(u)$ has no nonedges and (4) if $- \notin \phi(u, u)$, then $\gamma^{-1}(u)$ has 
no undirected edges. 

We define the set of types $\mathcal{K}(\mathcal{H})$ that we need to consider for this problem. 

Definition 3.4. Let $\mathcal{H}$ be a hereditary property of r-graphs [digraphs]. We use the notation $\mathcal{F}(\mathcal{H})$ to be the 
minimal set of r-graphs [digraphs] such that $\mathcal{H} = \bigcap_{H \in \mathcal{F}(\mathcal{H})} \mathrm{Forb}(H)$, where $\mathrm{Forb}(H)$ denotes the property of 
having no induced copy of $H$. 

We also denote $\mathcal{K}(\mathcal{H})$ to be the set of all r-types [dir-types], $K$, such that $H \not\mapsto K$ for all $H \in \mathcal{F}(\mathcal{H})$. 

The $f_K$ function is what we use to compute the edit distance. 

Definition 3.5. For an r-type, $K = (U, c)$ on $k$ vertices, and a density vector $\mathbf{p} = (p_1, \ldots, p_r)$, we define 
the function $f_K(\mathbf{p})$ as follows: For $p = 1, \ldots, r$, let the matrix $A_p$ be such that the $(i, j)^{\rm th}$ entry is 1 if 
$c(u_i, u_j) \ni p$ and zero otherwise. If $J$ denotes the $k \times k$ all-ones matrix, $\mathbf{1}$ denotes the $k \times 1$ all-ones vector, 
then 

$$f_K(\mathbf{p}) = \frac{1}{k^2}\,\mathbf{1}^T\left(J - \sum_{p=1}^{r} p_p A_p\right)\mathbf{1}.$$
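
As a worked illustration, the sketch below evaluates $f_K$ for small r-types. It assumes the formula $f_K(\mathbf{p}) = k^{-2}\,\mathbf{1}^T (J - \sum_p p_p A_p)\mathbf{1}$, which is a reading of the definition above, and expands the matrix product entrywise: each ordered pair $(u_i, u_j)$ contributes one minus the total probability of the colors it allows. The function name `f_K` and the dictionary layout for the type are illustrative assumptions.

```python
def f_K(type_vertices, phi, p):
    """Evaluate f_K(p) = (1/k^2) * sum over ordered pairs (u, w) of
    (1 - sum of p[color] over colors allowed by phi[(u, w)]).

    `p` is a sequence of length r with p[c - 1] the probability of color c.
    """
    k = len(type_vertices)
    total = 0.0
    for u in type_vertices:
        for w in type_vertices:
            total += 1.0 - sum(p[color - 1] for color in phi[(u, w)])
    return total / k ** 2


p = (0.3, 0.7)

# Single class that only allows color 1: every pair of color 2 must be edited.
single = f_K(["u"], {("u", "u"): {1}}, p)

# Two classes: color 1 inside each class, color 2 between the classes.
two_class = f_K(["u", "w"],
                {("u", "u"): {1}, ("w", "w"): {1},
                 ("u", "w"): {2}, ("w", "u"): {2}},
                p)
```

For $\mathbf{p} = (0.3, 0.7)$ the single-class type gives $0.7$, the expected fraction of pairs that have the wrong color, while the two-class type gives $(0.7 + 0.7 + 0.3 + 0.3)/4 = 0.5$.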

The $f_K$ function is defined in a slightly different way for digraphs. 

Definition 3.6. For a dir-type, $K = (U, c)$ on $k$ vertices, and a density vector $\mathbf{p} = (p, q)$, we define the 
function $f_K(\mathbf{p})$ as follows: For $p = \circ, -$, let the matrix $A_p$ be such that the $(i, j)^{\rm th}$ entry is 1 if $c(u_i, u_j) \ni p$ 
and zero otherwise. The matrix $A_\rightarrow$ has the property that the $(i, j)^{\rm th}$ entry is 

1, if $c(u_i, u_j)$ contains exactly one member of $\{\leftarrow, \rightarrow\}$; 

2, if $c(u_i, u_j) \supseteq \{\leftarrow, \rightarrow\}$; and 

0, otherwise. 

If $J$ denotes the $k \times k$ all-ones matrix, $\mathbf{1}$ denotes the $k \times 1$ all-ones vector, then 

$$f_K(\mathbf{p}) = \frac{1}{k^2}\,\mathbf{1}^T\left(J - (1 - p - 2q)A_\circ - pA_- - qA_\rightarrow\right)\mathbf{1}.$$

The entry of 2 is necessary in order to account for the fact that fewer editing operations are required if 
both directions are permitted rather than simply one direction. 
Finally, some definitions with respect to edit distance: 

Definition 3.7. For r-graphs [digraphs] $G = (V, c)$ and $G' = (V, c')$ on the same labeled vertex set, the 
expression $\mathrm{dist}(G, G')$ counts the number of pairs of vertices $v, v'$ such that $c(v, v') \ne c'(v, v')$. 
The distance of $G$ from $\mathcal{H}$ is $\min\{\mathrm{dist}(G, G') : G' \in \mathcal{H}\}$. 
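
For r-graphs, the distance in Definition 3.7 is a Hamming distance between two colorings of the same set of unordered pairs. The sketch below computes it, together with the minimum over an explicitly listed finite family that stands in for the members of the property on that vertex set. The names `dist` and `distance_to_family` and the list-based family are assumptions of this illustration.

```python
from itertools import combinations


def dist(vertices, c1, c2):
    """Number of unordered pairs {v, w} on which the two colorings differ."""
    return sum(1 for v, w in combinations(vertices, 2)
               if c1[frozenset((v, w))] != c2[frozenset((v, w))])


def distance_to_family(vertices, c, family):
    """Minimum of dist(c, c') over an explicitly listed, nonempty family of colorings."""
    return min(dist(vertices, c, c_prime) for c_prime in family)


V = [0, 1, 2, 3]
pairs = [frozenset(e) for e in combinations(V, 2)]

all_one = {e: 1 for e in pairs}
all_two = {e: 2 for e in pairs}
# Color 1 on pairs with even sum, color 2 otherwise: pairs {0,2} and {1,3} get color 1.
parity = {e: 1 if sum(e) % 2 == 0 else 2 for e in pairs}

d_one = dist(V, parity, all_one)
d_family = distance_to_family(V, parity, [all_one, all_two])
```

In the example, four of the six pairs have color 2 under `parity`, so its distance to the all-1 coloring is 4 and its distance to the family of the two monochromatic colorings is 2.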

We need to express the main application differently in the case of r-graphs and digraphs. However, only 
the r-graph version will be proven. 

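Theorem 3.8. Let $G'$ be an r-graph in hereditary property $\mathcal{H} = \bigcap_{H \in \mathcal{F}(\mathcal{H})} \mathrm{Forb}(H)$ and $\mathbf{p} = (p_1, \ldots, p_r)$ be 
a probability vector. Then, there exists an r-type $K \in \mathcal{K}(\mathcal{H})$ such that $H \not\mapsto K$ for all $H \in \mathcal{F}(\mathcal{H})$ and with 
probability going to 1 as $n \to \infty$, $\mathrm{dist}(G_{n,\mathbf{p}}, \mathcal{H}) \ge f_K(\mathbf{p})\binom{n}{2} - o(n^2)$. 

Theorem 3.9. Let $G'$ be a digraph in hereditary property $\mathcal{H} = \bigcap_{H \in \mathcal{F}(\mathcal{H})} \mathrm{Forb}(H)$ and $\mathbf{p} = (p, q)$ be a 
probability vector. Then, there exists a dir-type $K \in \mathcal{K}(\mathcal{H})$ such that $H \not\mapsto K$ for all $H \in \mathcal{F}(\mathcal{H})$ and with 
probability going to 1 as $n \to \infty$, $\mathrm{dist}(G_{n,\mathbf{p}}, \mathcal{H}) \ge f_K(\mathbf{p})\binom{n}{2} - o(n^2)$. 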

Proof. Fix $\eta \gg \delta \gg \epsilon > 0$. Let $G$ be distributed according to $G_{n,\mathbf{p}}$ and $G' \in \mathcal{H}$ be a graph of distance 
$\mathrm{dist}(G, \mathcal{H})$ from $G$. Apply Theorem 1.12 with $m = \epsilon^{-1}$ and any decreasing function $\mathcal{E}$ for which $\mathcal{E}(0) = \epsilon$ to 
$G'$ and consider the partition $A' = (V_1', \ldots, V_k')$. Construct the r-type [dir-type] $K_0 = (U, c_0)$ on vertex set 
$U = \{u_1, \ldots, u_k\}$ as follows. For distinct $i, j$, $c_0(u_i, u_j) \ni p$ if and only if the pair $(V_i', V_j')$ is $\mathcal{E}(k)$-regular 
such that the color $p$ occurs with density at least $\delta$. 

Now, we shall define $c_0$ on the vertices; i.e., $c_0(u_i, u_i)$, $u_i \in U$, such that $K_0 \in \mathcal{K}(\mathcal{H})$. Assume no such 
assignment to the vertices exists; i.e., for any choice of colors of the vertices, there exists an $H \in \mathcal{F}(\mathcal{H})$ 
for which $H \mapsto K_0$. Apply the regularity lemma (Theorem 1.6 in the r-graph case or Theorem 1.9 in the 
digraph case) to each of the clusters $V_i'$ and use Ramsey theory to find a clique of miniclusters that are regular 
with positive density in the same color. Assign that color to $u_i$ to complete the definition of $K_0$. Using the 
relevant slicing and embedding lemmas, we see that if $H \mapsto K_0$, then there is an induced copy of $H$ in $G'$, a 
contradiction. (See the authors and Kézdy [3] for details in the graph case.) 

As to counting the number of changes, for all distinct z < i', it is the case that dc ,p{Vi,Vi') — for all 
p ^ co{vi,v'i). By Theorem I1.12[ we can look at the equipartition A and see that for all but £{0)k'^ such 
pairs, dG'_p{Vi,Vi') < £{0) for all p co(ui,u-). Now consider the equipartition A as applied to G. We see 
that 



dist(G,G') > J2 J2 idGAV,,V,,)-£{0))\Vm'\-£{0)k^ 

l<i<i'<k p^coim-u^i) 

1 2 



l<i<i' <k p^CQ{ui,u^/) 



k 



2 /k 



£{0)k^ 



A routine Chernoff bound computation shows that, since k is bounded, if 1 < i < i' < k, then the 
probability that dG,p{Vi, Vi') < pp — ln/k\~^/^ is at most exp {— 2[n/fcJ^/'^}. Given an equipartition of V of 
order fc, the probability that there exists some pair {Vi,Vi') and some p & {1, . . . , r} such that dG,p{Vi,Vi') < 
Pp — \ n/k\^^^^ is at most rC^") exp 2[n/fcJ ^/'^}. The number of equipartitions, disregarding the labeling 



of the vertices, is bounded by a function of 5 = %3]('''^(0) 
equipartition with one such pair is 0(exp{-2(n/S'p3}). 
So, with that probability, and the fact that £{0) = e. 



, £) . Hence, the probability of having an 



dist(G, G') 



> 



E E 
Since $k \ge m = \epsilon^{-1}$, we can see that the error term in (3) is $O(r\epsilon n^2)$. So, for any $\eta > 0$, the probability 
that $\mathrm{dist}(G_{n,\mathbf{p}}, \mathcal{H}) \ge f_K(\mathbf{p})\binom{n}{2} - \eta n^2$ goes to 1 as $n \to \infty$. 

□ 